Facial emotion recognition has been extensively researched in recent years because of its many use cases. The most important applications are improving human-computer interaction and supporting people with autism spectrum disorders. Most of these applications require real-time execution. Inference time depends mainly on two factors: the model and the available computational resources. Hence, a real-time method must address both. In this paper, we use EfficientNetV2 because of its efficiency. Furthermore, we propose a scalable method based on resolution scaling that keeps inference real-time across different computational resources and models. The method uses a polynomial equation to find the best input resolution for a target inference time on a given hardware and model. Thus, the main objective of this paper is to propose a scalable real-time method for facial emotion recognition using resolution scaling. Using this polynomial equation for resolution scaling, we propose Scalable-ENV2B0 and Scalable-ENV2S, based on EfficientNetV2B0 and EfficientNetV2S, respectively. According to the final results on the KDEF dataset, Scalable-ENV2B0 can classify images of input size (302, 302, 3) in real time on our hardware. This model also achieves 96% accuracy on KDEF, which, to the best of our knowledge, outperforms previous real-time studies. However, the main advantage of the proposed method is its scalability, which has not been addressed for this task so far.
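The resolution-scaling idea can be illustrated with a minimal sketch. It assumes that inference time is measured at a few input resolutions on the target hardware, that a low-degree polynomial is fitted to those measurements, and that the largest resolution whose predicted time fits a latency budget is then selected. The function names, the polynomial degree, the 33.3 ms budget (roughly 30 frames per second), and the synthetic measurements are all assumptions made for illustration; they are not the paper's fitted equation or measured values.

```python
import numpy as np


def fit_latency_curve(resolutions, latencies_ms, degree=2):
    """Least-squares polynomial mapping input side length (px) to latency (ms)."""
    return np.poly1d(np.polyfit(resolutions, latencies_ms, degree))


def largest_resolution_within(curve, budget_ms, r_min, r_max):
    """Largest integer side length in [r_min, r_max] whose predicted
    latency does not exceed budget_ms, or None if no resolution fits."""
    candidates = np.arange(r_min, r_max + 1)
    feasible = candidates[curve(candidates) <= budget_ms]
    return int(feasible.max()) if feasible.size else None


# Synthetic, hypothetical measurements (NOT taken from the paper):
# latency grows quadratically with the side length of a square input.
resolutions = np.array([128, 160, 192, 224, 256, 288, 320])
latencies_ms = 5.0 + 3e-4 * resolutions**2

curve = fit_latency_curve(resolutions, latencies_ms)

# Assumed real-time budget of about 30 frames per second.
best = largest_resolution_within(curve, budget_ms=33.3, r_min=128, r_max=320)
print(f"Selected input size: ({best}, {best}, 3)")
```

Re-running the measurement and fit on different hardware or with a different backbone yields a different curve and therefore a different selected resolution, which is the sense in which such a procedure is scalable. Scanning integer candidates rather than solving for polynomial roots avoids having to pick among multiple or complex roots.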